Word Emdeddings through Hellinger PCA
نویسندگان
چکیده
Word embeddings resulting from neural language models have been shown to be a great asset for a large variety of NLP tasks. However, such architecture might be difficult and time-consuming to train. Instead, we propose to drastically simplify the word embeddings computation through a Hellinger PCA of the word cooccurence matrix. We compare those new word embeddings with some well-known embeddings on named entity recognition and movie review tasks and show that we can reach similar or even better performance. Although deep learning is not really necessary for generating good word embeddings, we show that it can provide an easy way to adapt embeddings to specific tasks.
منابع مشابه
Word Embeddings through Hellinger PCA
Word embeddings resulting from neural language models have been shown to be successful for a large variety of NLP tasks. However, such architecture might be difficult to train and time-consuming. Instead, we propose to drastically simplify the word embeddings computation through a Hellinger PCA of the word co-occurence matrix. We compare those new word embeddings with the Collobert and Weston (...
متن کاملIs deep learning really necessary for word embeddings?
Word embeddings resulting from neural language models have been shown to be successful for a large variety of NLP tasks. However, such architecture might be difficult to train and time-consuming. Instead, we propose to drastically simplify the word embeddings computation through a Hellinger PCA of the word co-occurence matrix. We compare those new word embeddings with some wellknown embeddings ...
متن کاملComparison of Kullback-Leibler, Hellinger and LINEX with Quadratic Loss Function in Bayesian Dynamic Linear Models: Forecasting of Real Price of Oil
In this paper we intend to examine the application of Kullback-Leibler, Hellinger and LINEX loss function in Dynamic Linear Model using the real price of oil for 106 years of data from 1913 to 2018 concerning the asymmetric problem in filtering and forecasting. We use DLM form of the basic Hoteling Model under Quadratic loss function, Kullback-Leibler, Hellinger and LINEX trying to address the ...
متن کاملSimple and Effective Dimensionality Reduction for Word Embeddings
Word embeddings have become the basic building blocks for several natural language processing and information retrieval tasks. Recently, there has been an emphasis on further improving the pre-trained word vectors through post-processing algorithms. One such area of improvement is the dimensionality reduction of word embeddings. Reducing the size of word embeddings through dimensionality reduct...
متن کاملFacilitating text reading in posterior cortical atrophy
OBJECTIVE We report (1) the quantitative investigation of text reading in posterior cortical atrophy (PCA), and (2) the effects of 2 novel software-based reading aids that result in dramatic improvements in the reading ability of patients with PCA. METHODS Reading performance, eye movements, and fixations were assessed in patients with PCA and typical Alzheimer disease and in healthy controls...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1312.5542 شماره
صفحات -
تاریخ انتشار 2013